Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support max_runtime in scheduler #6785

Merged
merged 2 commits into from
Dec 22, 2023
Merged

Support max_runtime in scheduler #6785

merged 2 commits into from
Dec 22, 2023

Conversation

berland
Copy link
Contributor

@berland berland commented Dec 12, 2023

Issue
Resolves #6784

Approach
Short description of the approach

(Screenshot of new behavior in GUI if applicable)

Pre review checklist

  • Read through the code changes carefully after finishing work
  • Make sure tests pass locally (after every commit!)
  • Prepare changes in small commits for more convenient review (optional)
  • PR title captures the intent of the changes, and is fitting for release notes.
  • Updated documentation
  • Ensured that unit tests are added for all new behavior (See
    Ground Rules),
    and changes to existing code have good test coverage.

Pre merge checklist

  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.

@berland berland force-pushed the max_runtime branch 2 times, most recently from 9cfcd4d to 56b2ed2 Compare December 12, 2023 08:40
@berland berland marked this pull request as ready for review December 12, 2023 08:40
@codecov-commenter
Copy link

codecov-commenter commented Dec 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (0140ccc) 83.90% compared to head (b22ce4f) 83.91%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6785      +/-   ##
==========================================
+ Coverage   83.90%   83.91%   +0.01%     
==========================================
  Files         365      365              
  Lines       21410    21428      +18     
  Branches      948      948              
==========================================
+ Hits        17964    17982      +18     
  Misses       3152     3152              
  Partials      294      294              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@berland berland self-assigned this Dec 12, 2023
@berland berland added scheduler release-notes:skip If there should be no mention of this in release notes labels Dec 12, 2023
self.real = real
self.started = asyncio.Event()
self.returncode: asyncio.Future[int] = asyncio.Future()
self.aborted = asyncio.Event()
self._scheduler = scheduler
self._callback_timeout: Optional[Callable[[int], None]] = callback_timeout
Copy link
Contributor

@xjules xjules Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this function needed and do we need this function as an attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeout is set up by _legacy.py and we need to keep in sync with how the existing job_queue operates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can have a

if isinstance(self._job_queue, Scheduler):

in _legacy.py? I reckon that is less dirty than having this callback_timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

src/ert/scheduler/job.py Outdated Show resolved Hide resolved
self.real = real
self.started = asyncio.Event()
self.returncode: asyncio.Future[int] = asyncio.Future()
self.aborted = asyncio.Event()
self._scheduler = scheduler
self._callback_timeout: Optional[Callable[[int], None]] = callback_timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can have a

if isinstance(self._job_queue, Scheduler):

in _legacy.py? I reckon that is less dirty than having this callback_timeout.

src/ert/scheduler/job.py Outdated Show resolved Hide resolved
src/ert/scheduler/job.py Outdated Show resolved Hide resolved
@berland
Copy link
Contributor Author

berland commented Dec 14, 2023

I added a commit that refactored the PR to use asyncio.wait_for, but I am not sure I liked that alternative better than the _max_runtime_task.

threadsafer_future_wait(self.returncode),
timeout=self.real.max_runtime,
)
except asyncio.exceptions.TimeoutError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asyncio.TimeoutError

src/ert/scheduler/job.py Outdated Show resolved Hide resolved
@xjules
Copy link
Contributor

xjules commented Dec 15, 2023

I added a commit that refactored the PR to use asyncio.wait_for, but I am not sure I liked that alternative better than the _max_runtime_task.

When studying both approaches, I'm bit more inclined (since it cleaner tbh) towards having a separate task that just asyncio.sleeps and cancel jobs in the end. Also wait_for might be just another version of the same approach anyway.

@berland berland force-pushed the max_runtime branch 8 times, most recently from ab0c709 to 445d9d1 Compare December 22, 2023 11:36
This highlights a behavioural change in the new LocalDriver, it will
not send the same events as the legacy local driver, see
test_async_queue_execution.py::test_happy_path

The new scheduler will not catch bare exceptions for now, and
thus the test for that situation is only applied for the legacy
JobQueue.
Copy link
Contributor

@pinkwah pinkwah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@berland berland merged commit cf03ba6 into equinor:main Dec 22, 2023
46 checks passed
@berland berland deleted the max_runtime branch September 17, 2024 07:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes:skip If there should be no mention of this in release notes
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Add support for MAX_RUNTIME in scheduler
4 participants